Advances in computer vision and machine learning have enabled robots to perceive their surroundings in powerful new ways, but these perception modules have well-known fragilities. We consider the problem of synthesizing safe controllers despite perception errors. The proposed approach constructs a state estimator based on Gaussian processes with input-dependent noise. For a given perceived state, this estimator computes a high-confidence set that contains the actual state. A robust neural network controller is then synthesized that can provably handle the state uncertainty. Furthermore, an adaptive sampling algorithm is proposed to jointly improve the estimator and the controller. Simulation experiments, including a realistic vision-based lane-keeping example in CARLA, illustrate the promise of the proposed approach for synthesizing robust controllers with deep-learning-based perception.
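As a rough 1-D illustration (not the paper's method; the function names and the noise model below are hypothetical stand-ins for the Gaussian-process estimator), a high-confidence set under input-dependent noise can be sketched as an interval whose width varies with the perceived state:

```python
def confidence_interval(mu, sigma, k=2.0):
    """High-confidence set for the actual state: posterior mean +/- k
    posterior standard deviations (a 1-D stand-in for the GP estimator's set)."""
    return (mu - k * sigma, mu + k * sigma)

# Hypothetical input-dependent (heteroscedastic) noise model:
# perception error grows with the perceived distance.
def noise_std(perceived_distance):
    return 0.05 + 0.02 * perceived_distance

perceived = 10.0
lo, hi = confidence_interval(perceived, noise_std(perceived), k=2.0)
```

A controller synthesized to be safe for every state in `(lo, hi)` is then robust to this class of perception errors.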
translated by Google Translate
One method for obtaining generalizable solutions to a machine learning task, when presented with diverse training environments, is to find \textit{invariant representations} of the data. These are representations of the covariates such that the best model on top of the representation is invariant across training environments. In the context of linear Structural Equation Models (SEMs), invariant representations may allow us to learn models with out-of-distribution guarantees, i.e., models that are robust to interventions in the SEM. To address the invariant representation problem in the {\em finite-sample} setting, we consider the notion of $\epsilon$-approximate invariance. We study the following question: if a representation is approximately invariant with respect to a given number of training interventions, will it continue to be approximately invariant on a larger collection of unseen SEMs? This larger collection of SEMs is generated through a parameterized family of interventions. Inspired by PAC learning, we obtain finite-sample out-of-distribution generalization guarantees for approximate invariance that hold \textit{probabilistically} over a family of linear SEMs without faithfulness assumptions. Our results show that the bounds do not scale with the ambient dimension when the intervention sites are restricted to lie in a constant-size subset of in-degree-bounded nodes. We also show how to extend our results to a linear indirect observation model that incorporates latent variables.
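A minimal sketch of the approximate-invariance check, assuming hypothetical 1-D environments and a least-squares model on top of a scalar representation (illustrative only; the paper works with linear SEMs and interventions):

```python
def best_linear_model(xs, ys):
    """Least-squares slope for y ~ b*x (no intercept), fit per environment."""
    num = sum(x * y for x, y in zip(xs, ys))
    den = sum(x * x for x in xs)
    return num / den

def eps_approx_invariant(environments, eps):
    """A representation is eps-approximately invariant if the optimal models
    trained on top of it differ by at most eps across environments."""
    slopes = [best_linear_model(xs, ys) for xs, ys in environments]
    return max(slopes) - min(slopes) <= eps

# Two hypothetical environments where y is roughly 2x plus env-specific noise.
env1 = ([1.0, 2.0, 3.0], [2.0, 4.1, 5.9])
env2 = ([1.0, 2.0, 4.0], [2.1, 3.9, 8.0])
ok = eps_approx_invariant([env1, env2], eps=0.1)
```

The question studied in the abstract is whether such a check passing on the training interventions implies it approximately passes on a larger, unseen family.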
We study the convergence rate of AdaGrad-Norm, as an exemplar of adaptive stochastic gradient methods (SGD) in which the step sizes change based on the observed stochastic gradients, for minimizing non-convex, smooth objectives. Despite their popularity, the analysis of adaptive SGD lags behind that of non-adaptive methods in this setting. Specifically, all prior works rely on some subset of the following assumptions: (i) uniformly bounded gradient norms, (ii) uniformly bounded stochastic gradient variance (or even noise support), (iii) conditional independence between the step size and the stochastic gradient. In this work, we show that AdaGrad-Norm exhibits an order-optimal convergence rate of $\mathcal{O}\left(\frac{\mathrm{poly}\log(T)}{\sqrt{T}}\right)$ after $T$ iterations under the same assumptions as optimally tuned non-adaptive SGD (unbounded gradient norms and affine noise variance scaling), and without any tuning parameters. We thus establish that adaptive gradient methods exhibit order-optimal convergence in much broader regimes than previously understood.
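AdaGrad-Norm uses a single scalar step size decayed by the cumulative squared gradient norm. A minimal sketch on a toy quadratic (the objective and constants are illustrative, not from the paper):

```python
import math

def adagrad_norm(grad, x0, eta=1.0, b0=1e-8, steps=2000):
    """AdaGrad-Norm: one scalar step size shared across coordinates,
    eta_t = eta / b_t with b_t^2 = b_0^2 + sum_{s<=t} ||g_s||^2."""
    x = list(x0)
    b2 = b0 * b0  # accumulator for the squared gradient norms
    for _ in range(steps):
        g = grad(x)
        b2 += sum(gi * gi for gi in g)     # b_t^2 update
        step = eta / math.sqrt(b2)         # no tuning needed
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x

# Toy smooth objective f(x) = 0.5 * ||x||^2, whose gradient is x itself.
x_star = adagrad_norm(lambda x: list(x), [5.0, -3.0])
```

Note the step size adapts automatically: no knowledge of smoothness or noise level enters the update, matching the abstract's "without any tuning parameters".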
We present two linear bandit algorithms with per-step complexity sublinear in the number of arms $K$. The algorithms are designed for applications in which the arm set is extremely large and slowly changing. Our key realization is that choosing an arm reduces to a maximum inner product search (MIPS) problem, which can be solved approximately without breaking regret guarantees. Existing approximate MIPS solvers run in sublinear time. We extend these solvers and present theoretical guarantees for the online learning problem, where adaptivity (i.e., a later step depends on the feedback from previous steps) becomes a unique challenge. We then explicitly characterize the tradeoff between per-step complexity and regret. For sufficiently large $K$, our algorithms have sublinear per-step complexity and $\tilde{O}(\sqrt{T})$ regret. Empirically, we evaluate our proposed algorithms in a synthetic environment and on a real-world movie recommendation problem. Our proposed algorithms can deliver a more than 72x speedup compared to linear-time baselines while retaining similar regret.
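The reduction of arm selection to MIPS can be sketched with an exact linear-scan solver (the parameter vector and arms below are made up; the paper replaces this $O(K)$ scan with a sublinear-time approximate MIPS solver):

```python
def select_arm_mips(theta, arms):
    """Exact MIPS baseline: pick the arm maximizing <theta, arm>.
    In a linear bandit, theta is the current parameter estimate, so the
    greedy/UCB-style arm choice is exactly a maximum inner product search."""
    def inner(a):
        return sum(t * ai for t, ai in zip(theta, a))
    best = max(range(len(arms)), key=lambda i: inner(arms[i]))
    return best, inner(arms[best])

theta_hat = [0.6, -0.2]   # hypothetical parameter estimate (e.g. ridge solution)
arm_set = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.0]]
idx, score = select_arm_mips(theta_hat, arm_set)
```

An approximate solver returning any arm whose inner product is within a controlled factor of the maximum is what the paper shows can preserve the regret guarantee.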
Model-agnostic meta-learning (MAML) has become increasingly popular for training models that can quickly adapt to new tasks via one or a few stochastic gradient descent steps. However, the MAML objective is more difficult to optimize than standard non-adaptive learning (NAL), and little is understood about how much MAML improves over NAL in terms of the fast adaptability of their solutions in various scenarios. We analytically address this issue in a linear regression setting consisting of a mixture of easy and hard tasks, where hardness is related to the rate at which gradient descent converges on the task. Specifically, we prove that in order for MAML to achieve a substantial gain over NAL, (i) there must be some discrepancy in hardness among the tasks, and (ii) the optimal solutions of the hard tasks must be closely packed, with their center far from the center of the easy tasks' optimal solutions. We also give numerical and analytical results suggesting that these insights apply to two-layer neural networks. Finally, we provide few-shot image classification experiments that support our insights on when MAML should be used, and emphasize the importance of training MAML on hard tasks in practice.
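A toy sketch of the MAML-vs-NAL comparison on quadratic task losses $f_i(w) = \frac{h_i}{2}(w - w_i)^2$, where small curvature $h_i$ means slow gradient-descent convergence, i.e., a hard task (closed forms below hold only for this toy, not the paper's general setting):

```python
def nal_solution(tasks):
    """NAL minimizes the average pre-adaptation loss f_i(w) = h_i/2*(w-w_i)^2,
    giving a curvature-weighted mean of the task optima."""
    num = sum(h * w for h, w in tasks)
    den = sum(h for h, _ in tasks)
    return num / den

def maml_solution(tasks, alpha):
    """MAML minimizes the average loss AFTER one gradient step of size alpha;
    for quadratics the adapted loss is h*(1 - alpha*h)**2/2 * (w - w_i)^2,
    so easy (high-h) tasks are down-weighted."""
    num = sum(h * (1 - alpha * h) ** 2 * w for h, w in tasks)
    den = sum(h * (1 - alpha * h) ** 2 for h, _ in tasks)
    return num / den

# Two hypothetical tasks (hardness h_i, optimum w_i): an easy task at 0
# and a hard (slowly converging) task at 10.
tasks = [(1.0, 0.0), (0.1, 10.0)]
w_nal = nal_solution(tasks)          # dominated by the easy task
w_maml = maml_solution(tasks, alpha=0.9)
```

The MAML initialization sits much closer to the hard task's optimum than the NAL one, illustrating why hardness discrepancy matters for MAML's gain.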
Data compression is becoming critical for storing scientific data because many scientific applications need to store large amounts of data and post-process this data for scientific discovery. Unlike image and video compression algorithms that limit errors to primary data, scientists require compression techniques that accurately preserve derived quantities of interest (QoIs). This paper presents a physics-informed compression technique implemented as an end-to-end, scalable, GPU-based pipeline for data compression that addresses this requirement. Our hybrid compression technique combines machine learning techniques and standard compression methods. Specifically, we combine an autoencoder, an error-bounded lossy compressor to provide guarantees on raw data error, and a constraint-satisfaction post-processing step to preserve the QoIs within a minimal error (generally less than floating-point error). The effectiveness of the data compression pipeline is demonstrated by compressing nuclear fusion simulation data generated by a large-scale fusion code, XGC, which produces hundreds of terabytes of data in a single day. Our approach works within the ADIOS framework and achieves compression by a factor of more than 150 while requiring only a few percent of the computational resources necessary for generating the data, making the overall approach highly effective for practical scenarios.
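The error-bounded lossy stage can be illustrated with simple uniform quantization, which guarantees a pointwise absolute-error bound (a sketch only; it is not the compressor actually used in the pipeline):

```python
def compress(data, abs_error):
    """Error-bounded lossy quantization: bin width 2*abs_error guarantees
    |x - decompress(compress(x))| <= abs_error for every value."""
    width = 2.0 * abs_error
    return [round(x / width) for x in data]   # integer codes, entropy-codable

def decompress(codes, abs_error):
    width = 2.0 * abs_error
    return [c * width for c in codes]

raw = [0.0, 0.013, -0.258, 1.7, 3.14159]      # made-up sample values
bound = 0.01
restored = decompress(compress(raw, bound), bound)
```

In the paper's pipeline, such raw-data guarantees are complemented by a constraint-satisfaction step so that derived QoIs, not just primary values, stay within error bounds.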
We propose a novel model-agnostic, data-driven framework for time-dependent reliability analysis. The proposed approach -- referred to as MAntRA -- combines interpretable machine learning, Bayesian statistics, and stochastic dynamic equation identification to evaluate the reliability of stochastically excited dynamical systems for which the governing physics is \textit{a priori} unknown. A two-stage approach is adopted: in the first stage, an efficient variational Bayesian equation discovery algorithm is developed to determine the governing physics of an underlying stochastic differential equation (SDE) from measured output data. The developed algorithm is efficient and accounts for epistemic uncertainty due to limited and noisy data, and for aleatoric uncertainty due to environmental effects and external excitation. In the second stage, the discovered SDE is solved using a stochastic integration scheme and the probability of failure is computed. The efficacy of the proposed approach is illustrated on three numerical examples. The results obtained indicate the possible application of the proposed approach for reliability analysis of in-situ and heritage structures from on-site measurements.
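The second stage can be sketched as Euler-Maruyama integration of a hypothetical discovered SDE plus a Monte Carlo estimate of the probability of failure (threshold crossing); all names and constants below are illustrative, not from the paper:

```python
import math
import random

def failure_probability(drift, diffusion, x0, threshold,
                        t_end=1.0, dt=0.01, n_paths=2000, seed=0):
    """Integrate dX = drift(X) dt + diffusion(X) dW with Euler-Maruyama and
    estimate P(max_t X_t > threshold) over n_paths Monte Carlo sample paths."""
    rng = random.Random(seed)
    steps = round(t_end / dt)
    failures = 0
    for _ in range(n_paths):
        x = x0
        for _ in range(steps):
            dw = rng.gauss(0.0, math.sqrt(dt))   # Brownian increment
            x = x + drift(x) * dt + diffusion(x) * dw
            if x > threshold:                     # first-passage failure
                failures += 1
                break
    return failures / n_paths

# Hypothetical discovered SDE: mean-reverting dX = -X dt + 0.5 dW,
# with a failure threshold far above its stationary spread.
p_fail = failure_probability(lambda x: -x, lambda x: 0.5, x0=0.0, threshold=5.0)
```

In MAntRA the drift and diffusion would come from the stage-one variational Bayesian equation discovery rather than being specified by hand.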
We present DyFOS, an active perception method that Dynamically Finds Optimal States to minimize localization error while avoiding obstacles and occlusions. We consider the scenario where a ground target without any exteroceptive sensors must rely on an aerial observer for pose and uncertainty estimates to localize itself along an obstacle-filled path. The observer uses a downward-facing camera to estimate the target's pose and uncertainty. However, the pose uncertainty is a function of the states of the observer, target, and surrounding environment. To find an optimal state that minimizes the target's localization uncertainty, DyFOS uses a localization error prediction pipeline in an optimization search. Given the states mentioned above, the pipeline predicts the target's localization uncertainty with the help of a trained, complex, state-dependent sensor measurement model (a probabilistic neural network in our case). Our pipeline also predicts target occlusion and obstacle collision to remove undesirable observer states. The output of the optimization search is an optimal observer state that minimizes target localization uncertainty while avoiding occlusion and collision. We evaluate the proposed method using numerical and simulated (Gazebo) experiments. Our results show that DyFOS is almost 100x faster than, yet performs as well as, a brute-force search. Furthermore, DyFOS yielded lower localization errors than random and heuristic searches.
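The optimization search can be sketched as prune-then-minimize over candidate observer states (the 1-D state, uncertainty model, and pruning rules below are hypothetical stand-ins for the learned measurement model and the occlusion/collision predictors):

```python
def dyfos_search(candidate_states, predict_uncertainty, occluded, collides):
    """Score every candidate observer state with an uncertainty predictor,
    after pruning states that occlude the target or collide with obstacles."""
    feasible = [s for s in candidate_states
                if not occluded(s) and not collides(s)]
    if not feasible:
        return None   # no safe, unoccluded observer state exists
    return min(feasible, key=predict_uncertainty)

# Hypothetical 1-D observer state (altitude): uncertainty grows with
# altitude, very low altitudes collide, and one band occludes the target.
states = [1.0, 2.0, 3.0, 4.0, 5.0]
best = dyfos_search(
    states,
    predict_uncertainty=lambda s: 0.1 * s,   # stand-in for the neural model
    occluded=lambda s: 2.5 < s < 3.5,        # target hidden from this band
    collides=lambda s: s < 1.5,              # too low: obstacle collision
)
```

DyFOS's speedup over brute force comes from replacing the exhaustive scan implied here with an optimization search over the same predicted objective.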
Adversarial training has been empirically shown to be more prone to overfitting than standard training. The exact underlying reasons remain to be fully understood. In this paper, we identify one cause of overfitting related to current practices of generating adversarial samples from misclassified samples. To address this, we propose an alternative approach that leverages the misclassified samples to mitigate the overfitting problem. We show that our approach achieves better generalization while having comparable robustness to state-of-the-art adversarial training methods on a wide range of computer vision, natural language processing, and tabular tasks.
Adversarial training is widely acknowledged as the most effective defense against adversarial attacks. However, it is also well established that achieving both robustness and generalization in adversarially trained models involves a trade-off. The goal of this work is to provide an in-depth comparison of different approaches for adversarial training in language models. Specifically, we study the effect of pre-training data augmentation, as well as training-time input perturbations vs. embedding-space perturbations, on the robustness and generalization of BERT-like language models. Our findings suggest that better robustness can be achieved by pre-training data augmentation or by training with input-space perturbation. However, training with embedding-space perturbation significantly improves generalization. A linguistic correlation analysis of the neurons of the learned models reveals that the improved generalization is due to `more specialized' neurons. To the best of our knowledge, this is the first work to carry out a deep qualitative analysis of different methods of generating adversarial examples in adversarial training of language models.
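Input-space and embedding-space perturbations differ mainly in where the gradient-sign step is applied; a minimal FGSM-style sketch (the vectors below are made up, and the paper's exact perturbation method may differ):

```python
def sign(v):
    """Elementwise sign of a vector."""
    return [1.0 if vi > 0 else -1.0 if vi < 0 else 0.0 for vi in v]

def fgsm_perturb(x, grad_x, eps):
    """FGSM-style perturbation x + eps * sign(grad_x). Applied to the raw
    input it is an input-space perturbation; applied to a token's embedding
    vector it is an embedding-space perturbation."""
    return [xi + eps * si for xi, si in zip(x, sign(grad_x))]

embedding = [0.2, -0.5, 0.1]   # hypothetical token embedding
grad = [0.3, -0.1, 0.0]        # gradient of the loss w.r.t. the embedding
perturbed = fgsm_perturb(embedding, grad, eps=0.05)
```

For discrete text, input-space perturbation instead means replacing tokens (e.g., via augmentation), which is why the two regimes can behave so differently in the study above.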